I. INTRODUCTION


With the rapid technological changes in data storage 
and processing, managers and administrators have been
changing the way they make decisions. They have relied
less on their intuitions and more on data. This change has 
become necessary, as Jetzek et al. (2014) point out, 
because of the myriad possibilities for creating, collecting, 
and storing data in our increasingly digital world. 


According to data from Statista (2022) through 2021, and
estimates from 2022 through 2025, the growth in the
creation, capture, and consumption of information and data
is evident, as presented in Fig. 1.


Fig. 1: Volume of data/information created, captured,
copied, and consumed worldwide from 2010 to 2025
adapted from (Statista, 2022)


In this sense, Brynjolfsson and McElheran (2016)
conducted a systematic empirical study regarding the
diffusion of what has been termed Data-Driven
Decision-Making (DDD).


At the industrial level, DDD has been primarily
concentrated on the following characteristics: (i)
large-scale companies, (ii) owning and using information
technology, (iii) having skilled workers, and (iv)
significant levels of awareness.


In the scope of Public Administration, typically the
owner of large volumes of data and information,
information technologies are increasingly used and,
despite the different realities and specificities of each state
or nation, there are trained professionals in greater or
lesser numbers. In addition, there is a movement toward
making government data available, known as Open
Government Data (OGD).


According to Matheus et al. (2020), efforts in this
direction can result in more democracy, greater
administrative efficiency, transparency, accountability,
collaboration, engagement, and trust in government. In
addition, they can potentially result in the generation of
considerable economic and social value. However,
according to Jetzek et al. (2014), there is still a lack of
understanding of how this can happen, indicating the need
for greater attention and further exploration of the topic.


Birchall (2015) states that OGD is a necessary
component of the new "data economy." To participate in and
benefit from so-called “infocapitalism” democracy, the
data subject is called to be: (i) an auditor who monitors
granular state transactions in the name of accountability,
(ii) an entrepreneur who makes data profitable through
apps and visualizations, and (iii) a consumer of these apps
and visualizations.


In this growing context of data and information comes
Data Science, which is intrinsically intertwined with other
concepts of growing importance such as big data, artificial
intelligence, and DDD. This perspective provides a
framework and principles that allow the manager to
systematically address problems, extract useful
knowledge from data, and thus make better-informed
decisions (Provost & Fawcett, 2013).


Data scientists in the government context need not only
a solid knowledge of statistics and data analysis and of the
techniques and tools used for prediction and for
visualizing results, but also an understanding of other
elements such as policy-making, organization, legislation,
and public values. This combined knowledge allows the
data to be placed in context and its use and implications
to be understood (Matheus et al., 2020).


II. LITERATURE REVIEW


DDD, as a new paradigm, emerges from digitalization 
and networks and is based on a new and valuable resource, 
data. Thus, new practices have been spreading rapidly
among companies regardless of organization size, as well
as in Public Administration (Klingenberg et al., 2019). 


Provost and Fawcett (2013) define DDD as the practice of
basing decisions on data analysis rather than purely on
intuition. They also emphasize that it is not an
all-or-nothing practice, meaning that it can be used to a
greater or lesser degree within organizations.


In this context, there is also the movement called
"Open Government" or "Open Data", defined by Kassen
(2013a) as one in which government data is available for
use and distribution by anyone without any copyright
restrictions. Thus, the dichotomous world (Market and
State) has been transformed into an open and
interconnected world in which the traditional roles and
relationships between these agents are being replaced by
complex interdependencies. Therefore, the production and
use of these data for decision-making and their availability
from public authorities become even more significant to the
extent that citizens and public and private organizations
have not only the opportunity but also the motivation and
ability to use data to achieve social and economic value
(Jetzek et al., 2014).


Brynjolfsson et al. (2011) statistically showed that the 
more a company is data-driven, the more productive it is, 
and it can achieve gains of around 4% to 6%. They also 
highlight the correlation (almost causal) with a higher 
return on assets, equity, asset utilization, and market value. 
Similarly, they point out that productivity increases in the 
context of Public Administration when DDD is used, with 
gains of around 5% to 6% beyond what can be explained 
by traditional inputs and the use of IT. 


Data Science has supported and increasingly
overlapped with DDD through automated computational
systems, whether for decisions that require discoveries
to be made within the data or for decisions that are
repeated, especially on a large scale. Another critical aspect
is the support of data-analytic thinking, a skill that is
important for both data scientists and employees across
the organization because it allows them to
understand the fundamental concepts and to have frameworks
for organizing analytical thinking. This skill enables not only
competent interaction but also the identification of
opportunities to improve DDD and of data-driven
competitive threats. However, investments in analytics can
be useless, and even harmful, unless employees can also
incorporate that data into complex decision-making.
Therefore, for Data Science to flourish as a field, it must
think beyond the commonly used algorithms, techniques,
and tools. It needs to consider the elementary principles
and concepts that underlie the techniques and the
systematic thinking that promotes success in DDD.
Success in the DDD business environment requires
the ability to think about how the fundamental concepts
apply to specific problems and businesses (analytical
thinking) (Provost & Fawcett, 2013).


Public value is another related concept in the OGD and
e-government literature. The public value framework is
based on the premise that public resources should be used
to increase value, not only in the economic sense but also
more broadly in terms of what is valued by citizens.


2.1. The methodological framework of this study 


This is an exploratory study as regards its purpose and 
bibliographical as regards the means of investigation 
because it is a systematic study developed based on 
published articles (Vergara, 2004). 


According to Broadus (1987), bibliometrics is a type of
quantitative and statistical bibliographic research that 
originated in Information Science. However, this study is 
also qualitative because the data obtained will be analyzed 
according to interests, delimitations, and criteria defined 
by the authors. 


Thus, to obtain a set of bibliographic references about
data-driven decision-making in the context of public
administration, and to identify from this portfolio the
prominent articles, authors, and journals dealing with this
theme, the structured procedure called Knowledge
Development Process - Constructivist (ProKnow-C) was
used.


The ProKnow-C framework starts by considering the
researcher's interest in a theme, as well as some
delimitations and restrictions that help the researcher, in a
structured way, to select and analyze relevant articles.
According to Ensslin et al. (2010), the concept of
bibliometric analysis is
based on the quantitative evidencing of the parameters of a
selected set of articles: the selected articles themselves,
their sets of bibliographic references, authors, relevant
journals, and the number of citations.


The next section presents the methodological 
procedures used in the search, collection, selection, and 
analysis of relevant publications related to the theme under 
study. 


III. METHODOLOGICAL PROCEDURES


Kitchenham (2004) summarizes that systematic
reviews are a means to assess and interpret the relevant
research available for a research question, thematic area, or
phenomenon of interest. Among the main motivations for
studies of this nature, the author highlights the possibility
of (i) synthesizing evidence concerning a treatment or
technology to summarize, for example, empirical evidence of
the benefits and limitations of a specific method; (ii)
identifying gaps in current research in order to suggest
areas for further investigation; (iii) providing a framework
to adequately position new research activities; and (iv)
assessing the extent to which empirical evidence may
support or contradict theoretical hypotheses, or assist in
the generation of new hypotheses.


Karlsson (2009), regarding the use of systematic
reviews, highlights that they (i) provide scientific support
by basing work on relevant publications; (ii) justify the
choice of a theme and the consequent contribution of a
research proposal; (iii) substantiate the methodological
framework of the research; (iv) make the research feasible
by delimiting its scope; and (v) allow the researcher to
develop the capacity to analyze information and to
critique the specific literature.


3.1. The filters 


The procedures described below were carried out in
May and June 2022.


Two databases were selected as sources: Web of
Science (or ISI), which gives rise to the JCR index
(Journal Citation Report) that evaluates the impact factor
of journals, and Scopus (Elsevier), which currently
holds the title of the largest database of scientific articles
in the world.


The first filter for the selection of articles was the 
choice of keywords grouped into two thematic axes: "data- 
driven" and "public sector". The keywords and their 
respective synonyms initially selected relative to each axis 
are presented in Table 1. 


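Table.1: Selected keyword combinations
N°   Axis 1           Axis 2
1    "data-driven*"   "Public*"
2    "data-driven*"   "Public Admin*"
3    "data-driven*"   "Public Sec*"
4    "data-driven*"   "Public Serv*"
5    "data-driven*"   "Public Manage*"
6    "data-driven*"   "Govern*"
7    "data-driven*"   "Open Gov*"
8    "data-driven*"   "Open Data*"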


Synonyms and wildcard characters were used so as not
to over-restrict the search results in the databases. The
searches with these words were carried out only in the
titles, the keywords defined by the authors of the articles,
and the abstracts of the articles ("TOPIC" selection in the
search fields of the databases).


Only articles (type: "ARTICLE") published in the last 
10 years were selected, that is, published from 2012 until 
June 2022, when this research was carried out. 
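

The keyword combinations described above can be sketched as a simple cross product of the two axes. This is only an illustration: the axis terms come from Table 1, but the function name `build_queries` and the plain `AND` string layout are assumptions, and the field restriction ("TOPIC"), document type, and publication years would still have to be set in each database's own search interface.

```python
from itertools import product

# Axis terms as listed in Table 1.
AXIS_1 = ['"data-driven*"']
AXIS_2 = [
    '"Public*"', '"Public Admin*"', '"Public Sec*"', '"Public Serv*"',
    '"Public Manage*"', '"Govern*"', '"Open Gov*"', '"Open Data*"',
]

def build_queries(axis_1, axis_2):
    """Return one 'term AND term' string per pair of axis terms.

    Hypothetical helper: the real query syntax depends on the database.
    """
    return [f"{a} AND {b}" for a, b in product(axis_1, axis_2)]

queries = build_queries(AXIS_1, AXIS_2)
for query in queries:
    print(query)
```

Keeping the two axes as separate lists makes it easy to add a synonym to one axis without rewriting the other combinations.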


The areas of interest selected in each database are
presented in Table 2.


Table.2: Selected areas of interest in each database

Web of Science:
“Mathematics Interdisciplinary Applications”, “Management”,
“Business”, “Business Finance”, “Economics”,
“Interdisciplinary Applications”, “Public Administration”,
“Management”, “Multidisciplinary Sciences”,
“Social Sciences Mathematical Methods” and
“Social Sciences Interdisciplinary”.

Scopus:
“Business, Management and Accounting”;
“Economics, Econometrics and Finance”;
“Decision Sciences”; “Social Sciences” and
“Multidisciplinary”.


These were the preliminary filters adopted in the search 
for each keyword combination in each database. The next 
section will present the results and respective analyses 
conducted. 


IV. PRESENTATION AND DISCUSSION OF 
RESULTS 


The search results for each keyword combination, in
each database, are shown in Table 3.


Table.3: Database search results

Combinations                          Web of Science   Scopus
"data-driven*" AND "Public*"                     264      888
"data-driven*" AND "Public Admin*"                21       18
"data-driven*" AND "Public Sec*"                  19       41
"data-driven*" AND "Public Serv*"                 12       41
"data-driven*" AND "Public Manage*"                9       11
"data-driven*" AND "Govern*"                     600      615
"data-driven*" AND "Open Gov*"                     9       20
"data-driven*" AND "Open Data*"                   31       83
Total                                            965    1,717


From the total of articles obtained in the two databases,
745 duplicate articles were identified, resulting in a total of
1,937 distinct articles.
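

The text does not say how the duplicates were identified. As a minimal sketch, assuming records are matched on a normalized title, the merge could look like the following; `normalize_title`, `merge_distinct`, and the toy records are hypothetical, while the final arithmetic uses the totals reported in Table 3.

```python
import re

def normalize_title(title):
    """Lowercase a title and collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def merge_distinct(*result_sets):
    """Merge record lists, keeping the first record seen for each title."""
    seen, merged = set(), []
    for records in result_sets:
        for record in records:
            key = normalize_title(record["title"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged

# Hypothetical toy records, not taken from the study.
wos = [{"title": "Open Data: A Review"}, {"title": "Smart Cities"}]
scopus = [{"title": "open data - a review"}, {"title": "Data-driven government"}]
merged = merge_distinct(wos, scopus)
print(len(merged))  # 3

# Counts reported in the text: Table 3 totals minus the duplicates.
distinct = 965 + 1717 - 745
print(distinct)  # 1937
```

Because the same article can match several keyword combinations, duplicates arise within one database as well as across the two, which is why all result sets go through the same `seen` set.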


The next step was to read the titles of the articles in
search of those aligned with the theme of interest. After
this step, 1,633 articles were excluded, i.e., 304 were
aligned with the proposed theme.


We then proceeded to analyze the scientific recognition
of these 304 articles. For this, using the Google Scholar
(GS) tool, the number of citations of each article was
obtained.


As a cut-off criterion, the articles that represent around
80% of the total number of citations (8,599) were selected.
Thus, of the 304 articles aligned by title, 56 (or 18.42% of
the total) concentrated 80.044% of the citations. In other
words, the articles that received at least 38 citations were
selected.
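

The cut-off rule can be illustrated with a short sketch: rank the articles by citations, accumulate until the target share of all citations is reached, and keep every article at or above the citation count where that happens. The function name `citation_cutoff` and the toy citation counts are assumptions for illustration; the study's own list of 304 citation counts is not reproduced here.

```python
def citation_cutoff(citations, share=0.80):
    """Return the citation count at which the cumulative citations of the
    most-cited articles first reach `share` of the total."""
    ranked = sorted(citations, reverse=True)
    target = share * sum(ranked)
    running = 0
    for count in ranked:
        running += count
        if running >= target:
            return count
    return 0

# Hypothetical citation counts for eight articles.
toy_citations = [120, 40, 25, 25, 10, 5, 3, 2]
cutoff = citation_cutoff(toy_citations)
selected = [c for c in toy_citations if c >= cutoff]
share_reached = sum(selected) / sum(toy_citations)
print(cutoff, len(selected), f"{share_reached:.1%}")  # 25 4 91.3%
```

Keeping ties at the cut-off value means the selected share can exceed the target, which is consistent with stating the rule as a minimum number of citations.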


The 248 less cited articles will still be evaluated 
according to other criteria, and some may still be part of 
the final portfolio of articles selected as part of the 
theoretical framework of the research. 


The articles with the greatest scientific
recognition were then evaluated as to the alignment of the
abstract with the theme of interest. In this process, 11
non-aligned articles were eliminated.


Thus, 45 articles remained that were aligned as to the
title and the abstract and that presented a relevant quantity
of citations.


The 248 articles with few or no citations were also 
evaluated according to the following criteria: (i) articles 
published less than 2 years before the analysis, since there 
was not enough time to be cited yet; and (ii) when 
published more than 2 years before, being from 
researchers who are already among the authors of the 45 
articles selected so far. 


Among the 248 articles under review, 174 were 
published in 2020, 2021, or 2022. And among the 74 
articles with a publication date of more than 2 years, only 
1 was by an author present in the bibliographic portfolio. 
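

The two criteria applied to the less cited articles can be sketched as a single predicate. This is an illustration under stated assumptions: the record layout, the function name `passes_second_chance`, and the sample data are hypothetical, and "less than 2 years" is read here as a publication year of 2020 or later, matching the years counted in the text.

```python
def passes_second_chance(article, portfolio_authors,
                         analysis_year=2022, window=2):
    """Return True if a low-citation article is recent or has an author
    who is already in the portfolio.

    `article` is a dict with 'year' (int) and 'authors' (list of str).
    """
    recent = article["year"] >= analysis_year - window
    known_author = any(a in portfolio_authors for a in article["authors"])
    return recent or known_author

# Hypothetical data for illustration only.
portfolio_authors = {"Kassen M", "Janssen M"}
candidates = [
    {"title": "A", "year": 2021, "authors": ["Doe J"]},
    {"title": "B", "year": 2016, "authors": ["Kassen M"]},
    {"title": "C", "year": 2015, "authors": ["Roe R"]},
]
kept = [a["title"] for a in candidates
        if passes_second_chance(a, portfolio_authors)]
print(kept)  # ['A', 'B']
```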


After reading the abstracts of these 175 articles, 6 were 
selected based on their alignment with the research 
objective. 


Thus, these 6 articles were added to the set of 45 
previously selected for further reading. After the full 
reading of the 51 articles, 7 were excluded for being 
misaligned with the research theme, resulting in a set of 44 
articles for the final portfolio. 
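

As a consistency check, the counts reported across the selection steps can be chained in a few lines. All numbers below are taken from the text; only the variable names are introduced here.

```python
# Search stage (Table 3 totals and duplicate removal).
retrieved = 965 + 1717
distinct = retrieved - 745

# Selection by title, then by scientific recognition.
title_aligned = distinct - 1633
recognized = 56
less_cited = title_aligned - recognized

# Selection by abstract among the recognized articles.
abstract_aligned = recognized - 11

# Less cited articles re-examined: recent ones plus known authors.
reassessed = 174 + 1
rescued = 6

# Full reading.
read_in_full = abstract_aligned + rescued
portfolio = read_in_full - 7

print(distinct, title_aligned, less_cited, abstract_aligned,
      reassessed, read_in_full, portfolio)
# 1937 304 248 45 175 51 44
```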


In summary, the results obtained in the first two stages
of the framework (Search and Selection) are shown in Fig.
1. The results of the bibliometric analysis itself
(Analysis) will be presented in the next section.


[Flowchart of the search, selection, and analysis steps:
Scopus (n = 1717) and Web of Science (n = 965);
papers after exclusion of 745 duplicate articles (n = 1937);
alignment by article's title (n = 304);
alignment regarding scientific recognition (n = 56) and
unconfirmed scientific recognition (n = 248);
alignment by article's abstract (n = 45) and alignment by
abstract, recognized authors or published less than 2 years
ago (n = 6);
full-text aligned articles (n = 44);
portfolio articles, authors, journals and keywords (n = 44)
and portfolio article references (n = 2480): articles,
authors and journals;
presentation of portfolio highlights: journals, articles
and authors.]


Fig. 1: Main steps of ProKnow-C framework adapted 
from Lacerda et al. (2016) 


The 44 articles in the portfolio are presented in 
alphabetical order of the first author in Table 4. 


Table.4: Articles from the bibliographic portfolio
N° Article N° of citations GS
1. (Provost & Fawcett, 2013) 1,477
2. (Williamson, 2016) 413 
3. (Kassen, 2013) 362 
4. (Chakraborty & Ghosh, 2020) 293 
5. (Jetzek et al., 2014) 248
6. (Liang et al., 2018) 218 
7. (Bansak et al., 2018) 210
8. (Barns, 2018) 205 
9. (Elish & Boyd, 2018) 196 
10. (Matheus & Janssen, 2020) 149 
11. (Parycek et al., 2014) 137 
12. (Phillips-Wren & Hoskisson, 2015) 121 
13. (Chen et al., 2017) 90 
14. (Appelbaum et al., 2018) 88 
15. (Khalifa et al., 2014) 81 
16. (Birchall, 2015) 76 
17. (Batarseh & Latif, 2016) 73 
18. (Tenney & Sieber, 2016) 72 
19. (Klingenberg et al., 2019) 69 
20. (McBride et al., 2019) 66 
21. (Katsonis & Botros, 2015) 61
22. (Kassen, 2017) 59 
23. (Choi et al., 2018) 58 
24. (Hino et al., 2018) 53 
25. (Gupta & Rani, 2019) 51
26. (Moro Visconti & Morea, 2019) 49 
27. (Poel et al., 2018) 48 
28. (Waheed et al., 2018) 48 
29. (Marda, 2018) 47 
30. (Matheus et al., 2020) 43 
31. (van Oort et al., 2015) 43 
32. (Toufaily et al., 2021) 42 
33. (Kassen, 2018) 41 
34. (Lourenço et al., 2017) 41 
35. (French, 2014) 39 
36. (Dencik et al., 2019) 38 
37. (Hummel et al., 2021) 38 
38. (Severo et al., 2016) 38 
39. (M. Janssen et al., 2022) 18 
40. (Pereira et al., 2018) 14 
41. (Kassen, 2020) 6 
42. (Kim et al., 2019) 6 
43. (Chen & Ji, 2022) 0 
44. (Cheung & Chen, 2021) 0 


4.1. Bibliometric analysis of the bibliographic portfolio 


This section is dedicated to the bibliometric analysis of 
the selected portfolio to build a theoretical framework for 
data-driven decision-making in the context of Public 
Administration. The results will be presented in three 
stages: (i) a bibliometric analysis of the articles selected; 
(ii) a bibliometric analysis of the references of the articles 
in the portfolio; and (iii) the classification of the articles 
according to their relevance to the scientific community. 


From the bibliometric analysis, the journals with the
largest number of articles are shown in Fig. 2.


[Figure: bar chart, N° of articles in the portfolio per journal]


Fig. 2: Journals with the highest number of articles in the 
portfolio 


The journals “Policy and Internet” and “Government
Information Quarterly” each had 3 articles
among those selected for the portfolio. The other
periodicals (Annals of Operations Research, Australian
Journal of Public Administration, Behaviour and
Information Technology, Big Data, Big Data and Society,
Big Data Research, Chaos Solitons & Fractals, City
Culture and Society, Communication Monographs,
European Journal of Social Theory, Information &
Management, Information Technology and People,
International Journal of Disaster Risk Reduction,
International Journal of Electronic Government Research,
Internet Policy Review, Journal of Accounting Literature,
Journal of Decision Systems, Journal of Education Policy,
Journal of Information Science, Journal of Manufacturing
Technology Management, Law and Social Inquiry, Nature
Sustainability, Philosophical Transactions of the Royal
Society A-Mathematical Physical and Engineering
Sciences, Public Performance & Management Review,
Public Transport, Science, Social Science Computer
Review, Surveillance and Society, Transforming
Government: People Process and Policy, Transportation
Research Part D: Transport and Environment, Urban
Education and Urban Planning), which are indicated
with *** in Fig. 2, contributed only 1 article each.


The authors who stood out within the portfolio with the
highest number of articles are shown in Fig. 3.


[Figure: bar chart, N° of articles in the portfolio per author:
KASSEN M (4); MATHEUS R (3); JANSSEN M (3); CHEN Y (3);
PARYCEK P (2); CHOI Y (2)]


Fig. 3: Authors with the highest number of articles in the
portfolio


Researcher Maxat Kassen from Nazarbayev University 
(Kazakhstan) had 4 papers selected for the final portfolio, 
followed by Ricardo Matheus, Marijn Janssen, and Chen 
Yang with 3 papers, and Peter Parycek and Youngseok 
Choi with 2 papers each. The remaining 101 authors had 
only one of their papers selected. 

As for scientific recognition, by the number of citations
in GS, the articles are presented in descending order in Fig.
4.


[Figure: bar chart of total citations per portfolio article]


Fig. 4: Portfolio articles ordered by the number of
citations


The most prominent article, with 1,477 citations, is the
one by Foster Provost and Tom Fawcett entitled “Data
Science and its Relationship to Big Data and Data-Driven
Decision Making”.


Next, within the final portfolio, the occurrences of
keywords indicated by the authors were analyzed. The
results are presented in Fig. 5.


[Figure: bar chart, N° of occurrences per keyword:
big data (14); open data (9); open government (7);
government (6); smart cities (5); decision making (5);
transparency (4); open government data (4);
data-driven decision-making (4);
a second label ending in "government" (3); governance (3);
e-democracy (3); analytics (3); accountability (3);
twitter, social media, public sector innovation, public
sector, participation, open innovation, machine learning,
innovation, freedom of information, forecasting,
digitization, data, coronavirus, collaboration, civic
engagement, bibliometrics, algorithms (2 each).
Values not legible: artificial intelligence; predictive
analytics]


Fig. 5: Keywords most frequently used by authors in the 
articles in the portfolio 


We identified 33 keywords cited at least twice by the
authors. The most frequent was “big data” with 14
mentions, followed by “open data” with 9 mentions,
“open government” with 7, and “smart cities” and
“decision-making” with 5 mentions each. In addition to
these, another 156 keywords were mentioned only once by
the authors of the portfolio, as can be seen qualitatively in
Fig. 6.
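

A keyword tally of this kind can be sketched with a counter over the author keywords of each article. The sample keyword lists below are hypothetical, and `keyword_frequencies` is an illustrative helper; the text does not state how spelling variants were merged, so the lowercasing and trimming shown here are assumptions.

```python
from collections import Counter

def keyword_frequencies(keyword_lists):
    """Count in how many articles each normalized keyword appears."""
    counts = Counter()
    for keywords in keyword_lists:
        # A set avoids counting a keyword twice within one article.
        counts.update({k.strip().lower() for k in keywords})
    return counts

# Hypothetical author keywords for three articles.
sample = [
    ["Big Data", "open data"],
    ["big data", "Transparency"],
    ["Open Data", "big data "],
]
counts = keyword_frequencies(sample)
repeated = {k: n for k, n in counts.items() if n >= 2}
print(counts.most_common(1))  # [('big data', 3)]
print(sorted(repeated))       # ['big data', 'open data']
```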


Fig. 6: Wordcloud with the authors' keywords


4.2. Bibliometric analysis of references from the
bibliographic portfolio


From the 44 articles in the portfolio, 2,480 different
references were obtained. As for the analysis of the most
frequent journals in the portfolio references, the results are
shown in Fig. 7.


[Figure: bar chart, N° of articles per journal:
Government Information Quarterly (29); Information Polity (15);
Policy & Internet (11); Big Data & Society (11);
Public Administration Review (10); Open Govt Public Adm (8);
Harvard Business Review (8);
Public Performance & Management... (7); MIS Quarterly (7);
Public Administration (6); Science (5);
Information Systems Management (5);
Information & Management (5); Account Rev (5);
Strategic Management Journal (4); Nature (4);
Big Data Journal (4); American Review of Public.. (4);
Academy of Management Review (4).
Values not legible: Surveillance & Society; Research Policy;
Polit Analysis; PLOS ONE; OECD Digital Economic Papers;
Manage Science; Journal Organizational Computing;
Journal of Community Informatics; Journal of Accountancy;
Information, Communication & Society;
Business & Information Systems]


Fig. 7: Most Relevant Periodicals in the Portfolio 
References 


These are the 30 most prominent journals among the
references found in the portfolio. The most prominent
journal is “Government Information Quarterly” with 29
occurrences, followed by “Information Polity” with 15
occurrences and “Policy & Internet” and “Big Data &
Society” with 11 occurrences each.


Among the 3,374 authors cited by the articles in the
portfolio, 34 authors stood out with 6 or more citations.
These authors are shown in Fig. 8.


[Figure: bar chart, N° of citations per author:
JANSSEN, M. (38); ZUIDERWIJK, A. (19); KITCHIN, R. (19);
BERTOT, J. C. (12); IRANI, Z. (value not legible);
KASSEN, M. (10); CHARALABIDIS, Y. (10);
VAN OORT, N. (9); JAEGER, P.T. (9); HELBIG, N. (9);
GRIMMELIKHUIJSEN, S. (9);
LYON, D. (8); LOURENCO, R.P. (8); CRAWFORD, K. (8);
CHEN, Y. (8);
WEERAKKODY, V. (7); PASQUALE, F. (7);
MAYER-SCHÖNBERGER, V. (7); MARGETTS, H. (7);
HARRISON, T.M. (7); CHEN, H. (7); BRYNJOLFSSON, E. (7);
BOYD, D. (7); BAROCAS, S. (7); BANNISTER, F. (7);
SCHROEDER, R. (6); PROVOST, F. (6); OBAMA, B. (6);
MORO VISCONTI, R. (6); MATHEUS, R. (6); MANYIKA, J. (6);
LEE, J. (6); DAVENPORT, T.H. (6); CUKIER, K. (6)]


Fig. 8: Most cited authors among the references of the 
articles in the bibliographic portfolio 


The most prominent author in the portfolio references 
is Marijn Janssen from the Delft University of Technology 
with 38 citations, followed by Anneke Zuiderwijk and Rob 
Kitchin with 19 citations each, and John Bertot with 12 
citations. 


The 20 most prominent articles in the portfolio
references (by number of citations in GS) are shown in
Table 5.


Table.5: Most prominent articles among those cited in the
portfolio references
N° Article N° of citations GS
1. (Boyd & Crawford, 2012) 5,189
2. (Albino et al., 2015) 3,255
3. (Bertot et al., 2010) 2,846
4. (Kitchin, 2014b) 2,675
5. (Kitchin, 2014a) 2,666
6. (Kitchin, 2014c) 2,279
7. (M. Janssen et al., 2012) 2,053
8. (Burrell, 2016) 1,514
9. (Butler, 2013) 645
10. (Gurstein, 2011) 616
11. (K. Janssen, 2011) 380
12. (Kassen, 2013) 378
13. (Lourenço, 2015) 318
14. (Gonzalez-Zapata & Heeks, 2015) 259
15. (M. Janssen & Zuiderwijk, 2014) 211
16. (Peled, 2011) 200
17. (Clarke & Margetts, 2014) 186
18. (M. Janssen & Kuk, 2016) 83
19. (Barocas & Selbst, 2016) 60
20. (Kashin et al., 2015) 7


The article “Critical Questions for Big Data” by Danah 
Boyd and Kate Crawford, both contributors at Microsoft 
Research, was the most cited in GS among the 2,480 
articles in the bibliographic references in the portfolio. 


Of the 109 authors of the portfolio, Fig. 9 presents the
39 with at least 2 articles when the portfolio and the
portfolio references are counted together.


[Figure: stacked bar chart, N° of articles in the
bibliographic portfolio and in the references per author;
legend: N° of articles in the references, N° of articles in
the portfolio. Totals:
JANSSEN M (22); VAN OORT N (10); IRANI Z (9); KASSEN M (7);
MORO VISCONTI R (6); LOURENÇO R (6); BOYD D (6);
SIEBER R (5); KALVET T (5);
WILLIAMSON B (4); MCBRIDE K (4); ELISH M (4); ALJOHANI N (4);
VASARHELYI M, TOOTS M, SCHROEDER R, PROVOST F, KRIMMER R,
HASSAN S, GOVERDE R, FAWCETT T, BATARSEH F, ANTUNES,
PARYCEK P (3 each);
PIOTROWSKI S, PHILLIPS-WREN G, PEREIRA G, MARDA V, LEE H,
JETZEK T, HARTOG M, FRENCH M, BRANDS T, BARNS S, AVITAL M,
ARDILA-GOMEZ A, APPELBAUM D, CHOI Y (2 each).
Value not legible: MATHEUS R]


Fig. 9: Authors with the highest number of articles in the
bibliographic portfolio and the references in the
bibliographic portfolio


Again, Marijn Janssen appears as the leading
author among the references, with 22 articles, followed by
Niels van Oort, also from the Delft University of
Technology, with 10 articles, and Zahir Irani, from the
Business School of Brunel University, with 9 articles.
Maxat Kassen, the most prominent author in the portfolio
(with 4 articles, Fig. 3), had 7 articles cited in the
references.


From the bibliometric analysis, the most relevant 
journals and articles in academia can be evidenced through 
the combined analysis between the journals where the 
articles in the portfolio were published and the journals in 
the references, as shown in Fig. 10. 


[Figure: scatter plot divided into quadrants Q1 to Q4;
vertical axis: N° of articles in the bibliographic portfolio;
horizontal axis: N° of articles from the bibliographic
portfolio references]


Fig. 10: Top journals in the bibliographic portfolio and 
references 


Fig. 10 was divided into 4 quadrants. In quadrant Q1
we observe the prominent journals in the portfolio (“ASLIB
Journal Information Management”, “Plos One”,
“Surveillance & Society” and “Sustainability”), all with 2
articles each in the bibliographic portfolio. In Q2 are the
periodicals that stand out in the portfolio and in the
portfolio references (“Government Information Quarterly”
and “Policy & Internet”), which together have almost 45%
of the citations in the portfolio references and 3 articles
each in the bibliographic portfolio. In Q3 are the journals
that stood out in the portfolio references (“Big Data &
Society”, “Information & Management”, “Information
Polity”, “Public Performance & Management Review” and
“Science”), with at least 5 citations. And in Q4 are the
relevant periodicals in the portfolio and the references of
the portfolio (“Australian Journal of Public Administration”,
“Big Data Journal”, “International Journal of Accounting
Information Systems” and “Philosophical Transactions of
the Royal Society A-Mathematical Physical and
Engineering Sciences”), with one article in the portfolio
each and fewer than 5 citations in the references of the
bibliographic portfolio.
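

The quadrant assignment can be sketched as a two-threshold rule. The thresholds used here (at least 2 articles in the portfolio, at least 5 citations in the references) are a reading of the description above rather than values stated as such by the authors, and the function name `quadrant` is an assumption. The first three calls use counts reported in the text and in Fig. 7; the last one is a hypothetical journal.

```python
def quadrant(portfolio_articles, reference_citations,
             min_articles=2, min_citations=5):
    """Classify a journal by its prominence in the portfolio and in the
    portfolio references."""
    in_portfolio = portfolio_articles >= min_articles
    in_references = reference_citations >= min_citations
    if in_portfolio and in_references:
        return "Q2"
    if in_portfolio:
        return "Q1"
    if in_references:
        return "Q3"
    return "Q4"

print(quadrant(3, 29))  # Government Information Quarterly -> Q2
print(quadrant(1, 5))   # Science -> Q3
print(quadrant(1, 4))   # Big Data Journal -> Q4
print(quadrant(2, 1))   # hypothetical journal -> Q1
```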

When analyzing the scientific relevance of the articles
(obtained by the number of citations of each article) and
the incidence of articles by the same author in the
bibliographic portfolio references, we obtained the scatter
plot shown in Fig. 11.


[Figure: scatter plot divided into quadrants Q1 to Q4]


Fig. 11: Top articles and authors in the bibliographic 
portfolio 


Fig. 11 was also divided into 4 quadrants. In quadrant
Q1 are the top articles in terms of citations in GS, (Kitchin,
2014b) and (Kitchin, 2014c), both with more than 2,000
citations. In Q2 are the prominent articles written by
prominent authors, (Bertot et al., 2010), (M. Janssen et al.,
2012) and (Kitchin, 2014a), which received more than 2,000
citations in GS and more than 2 citations in the
portfolio. In Q3 are articles by prominent authors, (Bertot et
al., 2012), (Dawes & Helbig, 2010), (M. Janssen & Kuk,
2016), (M. Janssen & Zuiderwijk, 2014), (Kitchin et al.,
2015), (Weerakkody et al., 2017), (Zuiderwijk et al., 2012)
and (Zuiderwijk & Janssen, 2014), all with 2 citations in
the references and fewer than 1,000 citations in GS. And
in Q4 are the articles relevant to the topic, (Jaeger & Bertot,
2010), (Kitchin & Dodge, 2011), (Kitchin & Lauriault,
2014), (Kitchin, 2017), (Kitchin, 2015) and (Zuiderwijk et
al., 2014).


V. CONCLUSION 


The objective of this study was to use a
systematized procedure to select relevant articles to
compose a theoretical framework about DDD in the
context of Public Administration, given the relevance and
timeliness of the topic and its economic and social
impacts.


This study initially presented the procedures for
searching and selecting relevant articles and an analysis to
assess the main works, authors, and journals that have
been published on the topic. As summarized in Fig. 1,
using the ProKnow-C framework, from an initial volume
of 1,937 articles we obtained a bibliographic portfolio
composed of 44 articles presented in Table 4.


In addition to the article selection process, which aims
to compose a theoretical framework on the theme, a
bibliometric analysis was carried out. It was possible to
highlight the journals “Policy and Internet” and
“Government Information Quarterly” as the most
prominent in terms of publications on the theme.


As for the authors, the framework evidenced the 
contributions of Maxat Kassen, Ricardo Matheus, Marijn 
Janssen, and Chen Yang, with at least 3 papers each. 


Furthermore, the analysis of the bibliographic
references of the articles in the portfolio confirmed
the relevance of the journals “Government Information
Quarterly”, “Information Polity”, “Policy & Internet” and
“Big Data & Society”.


Thus, the use of data and new Big Data and Artificial
Intelligence technologies and the creation of transparency
through OGD initiatives are key areas of research on
DDD. This systematic review allowed us to verify the
increase in the production of studies related to open data,
transparency, and the use of new technologies to process,
classify, and group data, helping the public manager to
obtain insights and make decisions with the help of
technical and quantitative elements.


However, we emphasize that the results presented are
limited to the sample of journals and articles researched
and cannot be extrapolated to the entire set of
publications in the area.


As a suggestion for future work, we
recommend the application of the next stage of the
ProKnow-C framework, which proposes a systemic
content analysis of this bibliographic portfolio.